Return-Path: <icon-group-sender>
Received: (from root@localhost)
    by baskerville.CS.Arizona.EDU (8.11.1/8.11.1) id eASEuAs12554
    for icon-group-addresses; Tue, 28 Nov 2000 07:56:10 -0700 (MST)
Message-Id: <200011281456.eASEuAs12554@baskerville.CS.Arizona.EDU>
Date: Mon, 27 Nov 2000 22:45:07 -0600
From: gep2@terabites.com
Subject: Re: NLP Tools in ICON
To: jsampson@indexes.u-net.com, icon-group@cs.arizona.edu
Errors-To: icon-group-errors@cs.arizona.edu
Status: RO
Content-Length: 4121

> I asked in the Icon group mailing list about the Porter stemming algorithm
> but as there was no response I wrote a very rough one of my own. The Porter
> stemming algorithm is available on the Web in C, Perl and Java codings. For
> my purposes (medical terms) I have not been very impressed with its
> performance, at least in my implementation, but I am probably asking it to
> do things it was not meant for.

In general, I don't think either SNOBOL4 or Icon is necessarily the language
of choice if you have *REALLY* hard-core, full-bore number crunching that you
have to do. (If programming time is the bigger issue, then it's possible they
STILL are.)

> I have been taking some interest in Snobol. It seems the company making the
> SPITBOL implementation can ask a four-figure price for it, a fact which
> intrigues me, when the more sophisticated language Icon is free.

True about the SPITBOL pricing, but the four-figure prices are really only if
you're going to get the mainframe or Unix server versions. If you're planning
to use it (like most of us) on a PC running under some flavor of either Windows
or OS/2, it's priced rather along the lines of other commercial compilers (and
MUCH less than, say, IBM's PL/I). The latest price for SPITBOL386 that I've
seen on the Catspaw Web page (http://www.snobol4.com) is $295. Which, yeah,
isn't being given away, but OTOH if you really use the product very much (and I
use it CONSTANTLY) you ought to save enough time on **one** programming project
(quite possibly even on ONE PROGRAM) to make that cost back, and probably
several times over.

> ...It makes me think that must be where the action is, rather than with Icon.
> However, there is not much going on in the Snobol mailing list either.

I think that BOTH lists are largely inhabited by people who are busier "doing"
than just "talking about doing", and therefore spend much of their time there in
"lurk mode". When interesting topics turn up (which does happen from time to
time) you can get quite an amazing flurry of activity indeed. :-)
Basically this is a tool which just simply WORKS, so there really isn't much
need to have a place to conduct a continual bitchfest or anything.
My most recent project (and I actually used a combination of
SPITBOL386 and SNOBOL4+ for it) was developing programs to read and recover data
and source programs from a seriously garbled 1 GB hard drive... garbled badly
enough that the partition sector, boot sector, FAT and root directory were
essentially gone. I had to write a SNOBOL4+ loadable function to
retrieve data at the raw sector level (and I wrote that in assembly language...
something I've done very little of in the last few years!), and a couple of
SNOBOL4+ programs to read and create binary disk image files (which can then be
read subsequently without constant and annoying error retry delays), along with
the ability to merge image files from several passes (since a lot of these read
errors, curiously, end up being transient). Next I wrote various SPITBOL
programs to pattern match within the binary image files to identify sectors
containing line-oriented text, non-delimited ASCII data (in this case, usually
FoxPro database sectors), FoxPro database headers, etc., and then, within the
probable database sectors recovered, to match data records of various crucial
types. (It's fairly rare for me to write programs for clients which crunch for
literally a day or more solid...)
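
For anyone curious what the merge and classify steps look like in outline, here is a minimal sketch in Python (not the SNOBOL4+/SPITBOL code used for the actual job). Everything in it is an assumption made for illustration: each pass is modeled as a list holding one bytes object per sector, with None where the read failed; the function names, the labels, the 95% printable threshold and the short list of table type bytes are all hypothetical; and the header check is only a rough heuristic based on a type byte followed by a plausible last-update month and day.

```python
# Bytes treated as "text": printable ASCII plus tab, LF and CR.
PRINTABLE = set(range(0x20, 0x7F)) | {0x09, 0x0A, 0x0D}

# Assumed heuristic: a few first-byte type codes used by dBASE/FoxPro tables.
DBF_TYPE_BYTES = {0x03, 0x30, 0xF5}


def merge_passes(passes):
    """Combine several read passes; the first good read of each sector wins.

    Each pass is a list with one entry per sector: bytes for a good read,
    None for a read error. Sectors that failed in every pass stay None.
    """
    return [
        next((data for data in reads if data is not None), None)
        for reads in zip(*passes)
    ]


def classify_sector(sector):
    """Return a rough label for one sector (labels are hypothetical)."""
    if sector is None:
        return "unread"
    if not any(sector):
        return "empty"
    # A table header starts with a type byte, then year/month/day of last update.
    if (len(sector) >= 4 and sector[0] in DBF_TYPE_BYTES
            and 1 <= sector[2] <= 12 and 1 <= sector[3] <= 31):
        return "dbf-header"
    printable = sum(byte in PRINTABLE for byte in sector)
    if printable / len(sector) >= 0.95:
        # Line breaks suggest text or source files; none suggests fixed-width records.
        return "text-lines" if b"\n" in sector else "ascii-data"
    return "binary"
```

Calling merge_passes on the per-pass lists and then classify_sector on each merged sector gives a first map of the drive; matching individual data records inside the probable database sectors would be a further step on top of this.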
Although I own copies of both SNOBOL4+ and SPITBOL386 (and Icon too of course,
although I admit I use S*BOL *far* more than I actually use Icon), and I
certainly enjoy the exhilarating speed of SPITBOL, I do write even new programs
using both compilers depending on the job I'm doing. SNOBOL4+ is still an
excellent product, and it's freeware.
Gordon Peterson
http://personal.terabites.com/
Support the Anti-SPAM Amendment! Join at http://www.cauce.org/
12/19/98: the day the Conservatives demonstrated their scorn for their
fraudulent sham of representative government. Voters, remember it!